† Corresponding author. E-mail:
Project supported by the National Natural Science Foundation of China (Grant No. 61271079)
In the procedure of non-invasive fetal electrocardiogram (ECG) extraction, high-quality maternal R wave peak detection demands enhancing the maternal ECG component firstly. Among all the enhancing algorithms, the one based on the continuous wavelet transform (CWT) is very important and its effectiveness depends on the optimization of the used wavelet. However, up to now, there is still no clear conclusion on the optimal wavelet (including type and scale) for CWT to enhance the maternal ECG component of an abdominal ECG signal. To solve this problem, in this paper, we select several common used types of wavelets to carry out our research on what the optimal wavelets are. We first establish big-enough training datasets with different sampling rates and make a maternal QRS template for each signal in the training datasets. Second, for each type of selected wavelets, we find its optimal scale corresponding to each QRS template in a training dataset based on the principle of maximal correlation. Then calculating the average of all optimized wavelet scales results in the mean optimal wavelet of this type for the dataset. We use two original abdominal ECG databases to train and test the optimized mean optimal wavelets. The test results show that, as a whole, the mean optimal wavelets obtained are superior to the wavelets used in other publications for applying CWT to maternal ECG component enhancing.
Almost all promising non-invasive fetal electrocardiogram (ECG) extraction methods[1–8] involve accurately determining maternal ECG R-wave peaks. The high-quality maternal R wave peak detection demands enhancing the maternal ECG component firstly. Researchers have proposed a variety of ECG enhancing algorithms,[9–17] such as the difference method,[9] the Hilbert transform method,[10] the template matching method,[11] the wavelet transform method,[12–17] and so on. Among all the enhancing methods, the wavelet transform method is referred to as a very important one.
Specifically, there are two ways for the wavelet transform method to enhance the maternal ECG component: one[12–14] is to compute the continuous wavelet transform coefficients of an abdominal ECG signal based on a selected wavelet type and a certain scale and then take the modulus of the wavelet transform coefficients as the enhanced signal; the other[15–17] is to compute the wavelet transform coefficients of the abdominal ECG signal based on multiple scales, then take some coefficients on different scales to reconstruct an enhanced ECG signal with larger SNR. Both enhancing methods are affected by the wavelet type and the scale selected, our research will only focus on the wavelet optimization in the first enhancing method above.
Nowadays, when applying continuous wavelet transform (CWT) to enhance the maternal ECG component, most researchers[12–14,18] usually determine the mother wavelet and scale through their experiences or consulting other publications, but do not carry out deep researches about the wavelet optimization. Fard et al.[19] designed a new hybrid wavelet named HCW by combining the complex frequency B-spline wavelet (fbsp) and complex morlet (cmor) wavelet linearly, and proposed to determine the parameters and combination coefficients of the hybrid wavelet by the theory of template matching. However, they did not introduce the parameter determination method in detail and clearly, for instance, they did not explain what kind of standard ECG template database has been taken, and what kind of objective function has been used for the parameter optimization. Their main contribution is raising a new type of wavelet for ECG enhancing, rather than paying enough attention to optimize the parameters.
In this paper, we will first establish two training datasets from two original abdominal ECG databases with different sampling frequencies and make a maternal QRS template for each signal in the training datasets. Then we will select several commonly used types of wavelets,[12–21] including fbsp,[19] cmor,[19] Mexican hat wavelet (Mexh),[12,14] Daubechies wavelet (db),[17] Biorthogonal wavelet (bior),[20] and symlet wavelet (sym),[21] to carry out our research. For each type of wavelet, we will find one optimal scale for each QRS template of a training dataset based on the principle of maximal correlation and then work out a mean optimal wavelet for the whole training dataset. The mean optimal wavelets obtained through big databases should be more suitable for maternal ECG component enhancing statistically. Finally, we will use the two test datasets from the original abdominal ECG databases to test them.
The paper is arranged as follows. In Section
The CWT of a signal x(t) is defined by[14]
Several types of wavelets, fbsp, cmor, Mexh, db, bior, and sym, which are commonly used for ECG enhancing in other publications, are introduced as follows.
The fbsp function is defined as[19]
The other three wavelets are different from the above three. They have no explicit mathematical equations and their wavelet functions ψ(t) are related with scaling functions ϕ(t) through the two-scale difference equation
The waveforms of the six selected wavelets are shown in Fig.
Two abdominal ECG databases are used to build QRS templates and test the optimized wavelets in this paper. They are introduced as follows.
The experimental protocol is described as follows. (i) Let the woman lie down on a bed, clean her abdominal skin with alcohol, and apply saline water at the positions where the electrodes would be pasted to remove the grease on the skin and enhance the conductivity of the electrodes. (ii) Place the ten electrodes according to the first placement scheme, then record 1–2 groups of data with a duration of 24 s. (iii) Adjust the positions of the electrodes according to the second placement scheme and record 1–2 groups of data with a duration of 24 s. (iv) Adjust the positions of the electrodes according to the third placement scheme and record 1–2 groups of data with a duration of 24 s. For every pregnant woman, we collected 3–6 groups of data.
Finally, we totally collected 343 groups of data, and every group of data was comprised of 8-channel abdominal ECG signals with a duration of 24 s.
For each original database, we will establish a training dataset for the wavelet optimization and a test dataset to test the optimization results respectively due to their different sampling frequencies.
Before establishing the training datasets, first, thinking of there being less leads in Database 2, for each multi-channel signal in it, we use the formulas new_sig = sig(i) − sig(j) and new_sig = sig(i) + sig(j) to increase the number of leads, where sig(i) and sig(j) are sub-signals of the multi-channel signals, i > j. Second, to obtain high-quality maternal QRS templates, we select all the effective signals from both abdominal ECG databases, excluding the signals seriously polluted by disturbance and noise. Figure
The selection criterion is based on the template matching method. The details are as follows: first, carry out the initial maternal R wave peak detection on each signal[5] and obtain the corresponding QRS complex template whose duration is set to 0.12 s (this value is selected because the duration of QRS complex is usually less than 0.12 s);[26] second, calculate the correlation coefficients between each detected maternal R wave and the template, then compute the number of the coefficients higher than 0.8, which mean real or high-quality QRS waves based on the experiences of our previous research (in this step, we also record the positions of the maternal R wave peaks whose coefficients are higher than 0.8.); third, if the number is above 10 for Database 1 composed of short-duration data (24 s), or above 30 for Database 2 composed of long-duration data (60 s), the corresponding lead is selected to ensure both the quality and the quantity of signals in the training datasets. Finally we select 2649 leads of effective abdominal signals from Database 1 and 2655 leads from the extended Database 2 to build up the final training sets.
The establishment of the test datasets is much easier than that of the training datasets. First, extend Database 2 as above, and then select the signals whose maternal R wave peaks can be marked correctly no matter by an automatic algorithm[5] or by hand. Three authors from the hospital are experienced in reading ECG waveforms, and they guarantee that all of the R peaks marked by eye are reliable.
Before the optimization, considering that the wavelets may have more than one parameter, firstly we set the other parameters of the wavelets to fixed values and only let the scale be optimized. Since we need to compare our optimization results with the wavelets used in other publications, we select common used wavelets fbsp1-2-1.5,[19] cmor1-1.5,[19] Mexh,[12,14] db4,[17] bior1.5,[20] sym8[21] for optimization. Their waveforms can be seen in Fig.
For a certain kind of wavelet, we will optimize it on each training dataset respectively. The optimization steps are as follows.
(i) Make a maternal QRS template for each signal in the training dataset. First, a pre-filtering method is applied to the signal, see Ref. [5]; second, take the positions of the maternal R wave peaks recorded in the establishment of the training dataset; third, take 0.12 s-length windowed signals whose centers are the detected R peaks and average them to obtain a 0.12 s-length QRS template.
(ii) Take out a QRS template and calculate the correlation coefficients between the wavelets under different scales and the template; the scale corresponding to the maximal correlation coefficient is the optimal scale for this template.
(iii) After taking out all the QRS templates and obtaining their optimal scale respectively as above, average all the optimal scales to generate the mean optimal wavelet.
At the second step above, according to our rough estimation, the maximal value of the optimal scales of the wavelets we used must be lower than 100 for 500 Hz (sampling frequency) data, so we set the searching range of scales for them to 0.5:0.5:100; and for the sampling frequency 1000 Hz, the range is set to 1:1:200 correspondingly. Besides, when the scale is a, the support interval of the wavelet will be a times as when the scale is 1, for example, [−a × 20, a × 20] for fbsp1-2-1.5. As for the sampling frequency of the wavelet, after a thorough study, we set it to 1 Hz, which is the same with cwt.m in Matlab.
To compare our mean optimal wavelets with the wavelets used in the literature, we use two indices of the enhanced signals obtained by CWT to evaluate them.
The first index is defined through the following steps. First, mark the correct maternal R peaks no matter whether by an automatic algorithm[5] or by hand; second, mark the prominent disturbances which may increase difficulties in R wave detection automatically using a very simple peak detector. One example is shown in Fig.
The peak detector consists of two main parts: detect all peaks initially based on max search, and delete the peaks which are lower than the threshold. Its pseudo codes are presented in Appendix
In addition, after performing the peak detection, we should delete the maternal R wave peaks from the results by comparing them with the previous detected maternal R wave peaks. The details are as follows. If the absolute time difference between the detected peak and one maternal R wave peak marked before (confirming with eyes) is lower than a threshold 60 ms, we believe that this peak is corresponding to the maternal R wave and then delete it. The threshold 60 ms is determined by training in advance. Then, the prominent disturbances can be obtained.
After the maternal R wave peaks and disturbances are detected, the first index is defined as follows:
The second index is F1[5] and its definition is
Before showing the optimization results, one detail we should emphasize is that, the optimal scale we obtain directly above is related to the sampling frequency, which is a relative optimal scale. To achieve an optimal scale without the influence of the sampling frequency, we define the absolute optimal scale as K = scale/fs.
After using the six wavelets to implement the optimization experiments on the two training databases, we obtain the optimization results, which are shown in Table
The obtained mean optimal wavelets of the six selected wavelets (fbsp1-2-1.5, cmor1-1.5, mexh, db4, bior1.5, sym8) and the corresponding wavelets used in the literature are tested on the two test databases. Before the test, through a literature review, we obtain the K's values of the five wavelets fbsp1-2-1.5,[19] cmor1-1.5,[19] db4,[17] bior1.5,[20] and mexh[12,14] used in other publications and show them in Table
In the test, when performing the peak detection, we should keep two parameters dilation and thrhd unchanged; dilation = 0.1, thrhd = 0.4. For Database 1, Loop = 10; for Database 2, Loop = 25.
For each test database, firstly, we calculate the NSR for each enhanced signal of the test dataset after performing CWT and then take the sum of the NSRs of all data as the quality indicator of the wavelet. The test results for the sum of the NSRs are shown in Table
Secondly, for each wavelet, F1 is obtained by first computing the TP, FP, and FN of each test dataset, and then calculating the parameter according to Eq. (
From the results, we can find that the effectivenesses of three wavelets fbsp1-2-1.5, bior1.5, Mexh for maternal ECG component enhancing are similar and relatively poorer; sym8, db4, and especially cmor1-1.5 are much better.
By comparing our optimization results with the wavelets in other publications, we can also find that the optimization proposed in our manuscript is effective. Besides, in the whole method, the training datasets establishment and wavelet optimization are automatic and quick, which have made the experiment easy to carry out. Even if the test datasets are established and the test needs people's assistant to help confirm the maternal R wave peaks, it is very simple and easy to do in general.
The first database has a very limited gestation week range (38–40) while the second is not. However, the experimental results have shown that the optimal wavelets have nothing to do with that. The reason may be that the optimal wavelets we obtained are for maternal component enhancing.
Though we only used six types of wavelets as examples, the optimization and the test method for optimal wavelets can be applied to any kind of wavelets and supply more information about the optimal wavelet. Although our experiments are carried on the abdominal ECG databases of pregnant women, the optimization and test method can also be applied to ECG databases of other people. This work will be done in the future.
In the establishment of the training datasets, we have used the template matching method to estimate the quality of the signals. In our opinion, this method is very promising and deserves more attention.
The authors also think that it is significant to introduce the wavelet scale related index K in our paper, since the obtained mean optimal scale described by K has excluded the factor of sampling rate. In practice, it is easy for us to multiply the obtained optimal K with the actual signal sampling frequency to obtain the wavelet scale for use.
We obtain the mean optimal wavelets which are superior to the ones in the other publications when used for maternal ECG component enhancing. As for the wavelet types, the effectivenesses of three wavelets fbsp1-2-1.5, bior1.5, Mexh for maternal ECG component enhancing are similar and relatively poorer; while sym8, db4 and especially cmor1-1.5 are much better.
It is believed that the optimized wavelets in this paper can be used when applying CWT to maternal ECG component enhancing. At the same time, the proposed method to optimize and evaluate wavelets is helpful for similar research.
[1] | |
[2] | |
[3] | |
[4] | |
[5] | |
[6] | |
[7] | |
[8] | |
[9] | |
[10] | |
[11] | |
[12] | |
[13] | |
[14] | |
[15] | |
[16] | |
[17] | |
[18] | |
[19] | |
[20] | |
[21] | |
[22] | |
[23] | |
[24] | |
[25] | |
[26] |